MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases

Authors

  • Douglas Burdick
  • Manuel Calimlim
  • Johannes Gehrke
Abstract

We present a new algorithm for mining maximal frequent itemsets from a transactional database. Our algorithm is especially efficient when the itemsets in the database are very long. The search strategy of our algorithm integrates a depth-first traversal of the itemset lattice with effective pruning mechanisms. Our implementation of the search strategy combines a vertical bitmap representation of the database with an efficient relative bitmap compression schema. In a thorough experimental analysis of our algorithm on real data, we isolate the effect of the individual components of the algorithm. Our performance numbers show that our algorithm outperforms previous work by a factor of three to five.
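
The general idea in the abstract, a depth-first walk over the itemset lattice driven by a vertical bitmap per item, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of Python integers as bitmaps, and the simple superset check used for pruning are assumptions made for illustration, and the paper's own pruning mechanisms and bitmap compression are not reproduced here.

```python
from typing import Dict, FrozenSet, List, Set


def build_vertical_bitmaps(transactions: List[Set[str]]) -> Dict[str, int]:
    """Map each item to an int whose bit t is set when transaction t contains it."""
    bitmaps: Dict[str, int] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmap: int) -> int:
    """Number of transactions covered by a bitmap (count of set bits)."""
    return bin(bitmap).count("1")


def maximal_frequent_itemsets(
    transactions: List[Set[str]], min_support: int
) -> List[FrozenSet[str]]:
    """Hypothetical helper: depth-first search for maximal frequent itemsets."""
    bitmaps = build_vertical_bitmaps(transactions)
    items = sorted(i for i, b in bitmaps.items() if support(b) >= min_support)
    maximal: List[FrozenSet[str]] = []

    def dfs(head: List[str], head_bitmap: int, tail: List[str]) -> None:
        # Keep only the tail items whose extension of the head stays frequent.
        extensions = []
        for item in tail:
            bitmap = head_bitmap & bitmaps[item]  # intersect transaction sets
            if support(bitmap) >= min_support:
                extensions.append((item, bitmap))

        if not extensions:
            # Leaf: the head is maximal unless a known maximal set contains it.
            candidate = frozenset(head)
            if head and not any(candidate <= m for m in maximal):
                maximal.append(candidate)
            return

        # Generic pruning (an assumption of this sketch): if the head plus every
        # remaining extension already lies inside a known maximal set, nothing
        # in this subtree can be maximal.
        head_union_tail = frozenset(head) | {item for item, _ in extensions}
        if any(head_union_tail <= m for m in maximal):
            return

        for idx, (item, bitmap) in enumerate(extensions):
            remaining = [other for other, _ in extensions[idx + 1:]]
            dfs(head + [item], bitmap, remaining)

    all_transactions = (1 << len(transactions)) - 1
    dfs([], all_transactions, items)
    return maximal
```

Calling `maximal_frequent_itemsets` on a small list of transaction sets with a minimum support count returns the frequent itemsets that have no frequent proper superset. Intersecting bitmaps with `&` replaces a rescan of the database for each support count, which is the main appeal of the vertical layout.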

Similar resources

MaRFI: Maximal Regular Frequent Itemset Mining using a pair of Transaction-ids

Frequent pattern mining is a fundamental and dominant research area in data mining. Maximal frequent patterns are one of the compact representations of frequent itemsets. Many algorithms exist for finding maximal frequent patterns that are suitable for mining transactional databases. Users are not only interested in occurrence frequency but may also be interested in frequent patterns tha...

MAFIA: A Performance Study of Mining Maximal Frequent Itemsets

We present a performance study of the MAFIA algorithm for mining maximal frequent itemsets from a transactional database. In a thorough experimental analysis, we isolate the effects of individual components of MAFIA, including search space pruning techniques and adaptive compression. We also compare our performance with previous work by running tests on very different types of datasets. Our exp...

Memory Efficient Mining of Maximal Itemsets using Order Preserving Generators

In this paper, we propose a memory efficient algorithm for maximal frequent itemset mining from transactional datasets. We propose OP-MAX* (Order Preserving – MAXimal itemset mining) algorithm, which mines all the maximal itemsets from transactional datasets with less space and time. Our methodology uses a memory efficient maximality checking technique to generate frequent maximal itemsets. We ...

A Review on Algorithms for Mining Frequent Itemset Over Data Stream

Frequent itemset mining over dynamic data is an important problem in data mining. The two main factors for a data stream mining algorithm are memory usage and runtime, since both are limited resources. Mining frequent patterns in data streams, as in traditional databases and many other types of databases, has been widely studied in data mining research. Many applications like stock ...

A Novel Algorithm for Mining Hybrid-Dimensional Association Rules

An important issue in association rule generation is the discovery of frequent itemsets. Most existing real-time transactional databases are multidimensional in nature. The classical Apriori algorithm is mainly concerned with handling single-level, single-dimensional boolean association rules. These algorithms scan the transactional databases or datasets many times to find ...


Publication date: 2001